Lab #7

Noam Yan



I chose the high-pT data set, with pT ranging from 1000 to 1200.

1.

This is how we calculate the standard deviation, so they are expected to be equivalent.

2.

From the mass distribution, we can see that the QCD background is uniformly distributed. To optimize the expected significance, we therefore need to find the peak of the Higgs data and place the mass window around it.
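The window search described above can be sketched as a brute-force scan over bin edges. This is only an illustration under stated assumptions: the samples are synthetic (a Gaussian peak on a flat background), the expected yields `n_sig_expected` and `n_bkg_expected` are hypothetical, and the significance is approximated by s / sqrt(b), which may differ from the Poisson calculation used in the lab.

```python
import numpy as np

def best_mass_window(sig_mass, bkg_mass, n_sig_expected, n_bkg_expected, edges):
    """Scan every [lo, hi] window built from `edges`; return (significance, lo, hi)."""
    sig_counts, _ = np.histogram(sig_mass, bins=edges)
    bkg_counts, _ = np.histogram(bkg_mass, bins=edges)
    # Cumulative yields, scaled to the expected number of events, so that the
    # content of any window is the difference of two array entries.
    sig_cum = np.concatenate([[0], np.cumsum(sig_counts)]) * n_sig_expected / len(sig_mass)
    bkg_cum = np.concatenate([[0], np.cumsum(bkg_counts)]) * n_bkg_expected / len(bkg_mass)
    best = (0.0, edges[0], edges[-1])
    for i in range(len(edges) - 1):
        for j in range(i + 1, len(edges)):
            s = sig_cum[j] - sig_cum[i]
            b = bkg_cum[j] - bkg_cum[i]
            if b <= 0:
                continue  # skip windows with no background estimate
            z = s / np.sqrt(b)  # approximate significance (assumption)
            if z > best[0]:
                best = (z, edges[i], edges[j])
    return best

# Synthetic stand-ins for the real samples (NOT the lab data).
rng = np.random.default_rng(0)
sig_mass = rng.normal(125.0, 2.0, 20000)    # peaked "Higgs-like" signal
bkg_mass = rng.uniform(50.0, 200.0, 20000)  # flat "QCD-like" background
edges = np.linspace(50.0, 200.0, 76)        # bins of width 2

z_best, win_lo, win_hi = best_mass_window(sig_mass, bkg_mass, 100.0, 20000.0, edges)
print(f"best window: [{win_lo:.1f}, {win_hi:.1f}], significance ~ {z_best:.2f}")
```

Because the background is flat, the best window ends up as a narrow interval around the signal peak, which is the reasoning used above.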

3.

SET A

The plots above show the distribution of each feature. By comparing the two datasets, we get a rough idea of which features can help us distinguish them. The more the two distributions of a feature differ, the more easily we can discriminate using that feature.

I'd like to choose t21 and t3 as discriminative features.
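The visual comparison can also be made quantitative. The sketch below computes the overlap of two normalized histograms (1 means identical, 0 means fully separated). It is one possible measure, not necessarily the one used for the choice above, and the arrays are synthetic placeholders rather than the actual feature columns.

```python
import numpy as np

def histogram_overlap(a, b, bins=50):
    """Overlap of two normalized histograms on a common binning (0 to 1)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    lo = min(a.min(), b.min())
    hi = max(a.max(), b.max())
    ha, _ = np.histogram(a, bins=bins, range=(lo, hi))
    hb, _ = np.histogram(b, bins=bins, range=(lo, hi))
    pa = ha / ha.sum()
    pb = hb / hb.sum()
    # Shared probability mass, bin by bin.
    return float(np.minimum(pa, pb).sum())

# Synthetic example: a well-separated feature and a poorly separated one.
rng = np.random.default_rng(1)
reference = rng.normal(0.0, 1.0, 10000)
overlap_far = histogram_overlap(reference, rng.normal(3.0, 1.0, 10000))
overlap_near = histogram_overlap(reference, rng.normal(0.1, 1.0, 10000))
print(f"far: {overlap_far:.2f}, near: {overlap_near:.2f}")
```

A feature with a smaller overlap between the two datasets is a better candidate for discrimination.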

SET B

According to the plots, "mass" already separates the two datasets well. To improve the performance further, I will add "angularity" as a second feature for classification.

4.

Lab #8

High-Luminosity Data

Let's visualize the high-luminosity data in a histogram.

Optimizing the data

I will use the same selection as in Lab #7: mass in [125.6091423607632, 126.9275302079706].
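As a minimal sketch, the mass selection can be applied with a boolean mask. The window edges are the ones quoted above. The array `example_mass` is a made-up placeholder, and storing the mass as a NumPy array is an assumption about the data layout.

```python
import numpy as np

# Window edges taken from the Lab #7 optimization quoted in the text.
MASS_LO, MASS_HI = 125.6091423607632, 126.9275302079706

def select_mass_window(mass, lo=MASS_LO, hi=MASS_HI):
    """Return a boolean mask that is True for events with lo <= mass <= hi."""
    mass = np.asarray(mass, dtype=float)
    return (mass >= lo) & (mass <= hi)

# Placeholder values, not real events.
example_mass = np.array([120.0, 125.7, 126.5, 127.0, 130.2])
mask = select_mass_window(example_mass)
selected = example_mass[mask]
print(selected)
```

The same mask can be reused to select the other columns of each event, provided they are stored in arrays of the same length.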

Low-luminosity Data

Optimizing the data

I will use the same selection as in Lab #7: mass in [125.6091423607632, 126.9275302079706].

95% Confidence Level of signal yields

The observed 95% confidence-level upper limit on the signal yield is lower than the expected one.
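For illustration, an upper limit of this kind can be sketched for a simple counting experiment. This assumes a classical Poisson construction with a known background: the limit is the signal yield s for which P(N <= n_obs | s + b) drops to 5%. This may not be the exact procedure used in the lab, and the numbers `bkg`, `n_observed`, and `n_expected` are hypothetical.

```python
import math

def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson mean mu (adequate for moderate mu)."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total

def upper_limit(n_obs, bkg, cl=0.95):
    """Signal yield s at which P(N <= n_obs | s + bkg) falls to 1 - cl."""
    alpha = 1.0 - cl
    if poisson_cdf(n_obs, bkg) <= alpha:
        return 0.0  # background alone is already excluded at this level
    lo, hi = 0.0, 1.0
    while poisson_cdf(n_obs, hi + bkg) > alpha:
        hi *= 2.0  # grow the bracket until it contains the limit
    for _ in range(100):  # bisection
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid + bkg) > alpha:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical numbers: background of 10 events, 7 events observed.
bkg = 10.0
n_observed = 7
n_expected = 10  # count taken at the background expectation
obs_limit = upper_limit(n_observed, bkg)
exp_limit = upper_limit(n_expected, bkg)
print(f"observed limit: {obs_limit:.2f}, expected limit: {exp_limit:.2f}")
```

In this construction, an observed count below the background expectation gives an observed limit below the expected one, which is one possible reading of the result above.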